<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
            "http://www.w3.org/TR/REC-html40/loose.dtd">
<HTML>
<HEAD>



<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<META name="GENERATOR" content="hevea 1.08">
<LINK rel="stylesheet" type="text/css" href="umsroot.css">
<TITLE>
Writing Efficient Code
</TITLE>
</HEAD>
<BODY >
<A HREF="umsroot034.html"><IMG SRC ="previous_motif.gif" ALT="Previous"></A>
<A HREF="umsroot028.html"><IMG SRC ="contents_motif.gif" ALT="Up"></A>
<A HREF="umsroot036.html"><IMG SRC ="next_motif.gif" ALT="Next"></A>
<HR>

<H2 CLASS="section"><A NAME="htoc83">6.7</A>&nbsp;&nbsp;Writing Efficient Code</H2>
<A NAME="secefficientcode"></A>
Even with a declarative language, there are certain
constructs which can be compiled more efficiently than others.
It is however not recommended to write unreadable code with the aim
of achieving faster execution - intuition is often wrong about which
particular construct will execute more efficiently in the end.
The advice is therefore: <B>Try the simple and straightforward
solution first!</B> This will keep the code maintainable, and it will often be
as fast as, or only marginally slower than, elaborate tricks.
The second rule is to keep this original program even if you try
to optimise it. You may find out that the optimisation
was not worth the effort.
ECL<SUP><I>i</I></SUP>PS<SUP><I>e</I></SUP> provides some support for finding those program parts
that are worth optimising.<BR>
<BR>
To achieve the maximum speed of your programs, choose the following compiler
options:
<UL CLASS="itemize"><LI CLASS="li-itemize">
debug:off
<LI CLASS="li-itemize">opt_level:1 (the default)
<LI CLASS="li-itemize">expand:on (the default)
</UL>
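As a minimal sketch, these options can be passed in the option list of
<TT>compile/2</TT>, or the debug setting can be fixed inside a source file with
a pragma. The file name <TT>my_program.ecl</TT> is a hypothetical placeholder,
and the two default options are spelled out only for illustration:
<BLOCKQUOTE CLASS="quote">
<PRE CLASS="verbatim">
% At the top level: compile a (hypothetical) file without debug support.
?- compile("my_program.ecl", [debug:off, opt_level:1, expand:on]).

% Alternatively, inside the source file itself:
:- pragma(nodebug).
</PRE></BLOCKQUOTE>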
Some programs spend a lot of time in garbage collection,
collecting the stacks and/or the dictionary.
If the space is known to be deallocated anyway, e.g. on failure,
such programs can often be sped up considerably
by switching the garbage collector off or by increasing
the value of the <TT>gc_interval</TT> flag.
As the global stack expands automatically, this does not cause
a stack overflow, but it may of course exhaust the machine's memory.<BR>
<BR>
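The following sketch shows one way to adjust the garbage collector for a single
run and restore the previous setting afterwards. The goal <TT>my_goal</TT> is a
hypothetical placeholder for the entry point of the program, and the factor 4
is arbitrary. If <TT>my_goal</TT> raises an error, the old setting is not
restored.
<BLOCKQUOTE CLASS="quote">
<PRE CLASS="verbatim">
% Run my_goal once with a larger collection interval, then restore
% the old value whether the goal succeeded or failed.
?- get_flag(gc_interval, Old),
   New is 4 * Old,
   set_flag(gc_interval, New),
   ( my_goal -&gt; true ; true ),
   set_flag(gc_interval, Old).

% The same pattern with the collector switched off entirely.
?- get_flag(gc, OldGc),
   set_flag(gc, off),
   ( my_goal -&gt; true ; true ),
   set_flag(gc, OldGc).
</PRE></BLOCKQUOTE>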
When the program runs correctly but its speed is still
not satisfactory, use the profiling tools.
The profiler can tell you which predicates
are the most expensive ones, and the statistics tool
tells you why.
A program may spend its time in a predicate because the predicate
itself is very time consuming, or because it was executed frequently.
The port profiling tool gives you this information.
It can also tell whether the predicate was slow because it
created a choice point or because there was too much
backtracking due to bad indexing.<BR>
<BR>
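A typical session with the two tools might look like the sketch below. It
assumes that the libraries <TT>profile</TT> and <TT>port_profiler</TT> are
installed, and <TT>my_goal</TT> is again a hypothetical placeholder for the
goal to be measured:
<BLOCKQUOTE CLASS="quote">
<PRE CLASS="verbatim">
:- lib(profile).          % sampling profiler, provides profile/1
:- lib(port_profiler).    % port counting, provides port_profile/2

?- profile(my_goal).            % which predicates take the time?
?- port_profile(my_goal, []).   % how often was each port passed?
</PRE></BLOCKQUOTE>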
One very important point is the selection
of the clause that matches the current call.
If there is only one clause that can potentially match,
the compiler is expected to recognise this and generate code
that will directly execute the right clause
instead of trying several subsequent clauses until the
matching one is found.
Unlike most of the current Prolog compilers, the ECL<SUP><I>i</I></SUP>PS<SUP><I>e</I></SUP>
compiler tries to base this selection (<I>indexing</I>) on the most suitable
argument of the predicate<SUP><A NAME="text3" HREF="umsroot028.html#note3">1</A></SUP>.
It is therefore not necessary to reorder the predicate
arguments so that the first one is the crucial argument
for indexing. For example, in a predicate like
<BLOCKQUOTE CLASS="quote">
<PRE CLASS="verbatim">
p(a, a) :- a.
p(b, a) :- b.
p(a, b) :- c.
p(d, b) :- d.
p(b, c) :- e.
</PRE></BLOCKQUOTE>
calls where the first argument is instantiated, like <TT>p(d,Y)</TT>, will be
indexed on the first argument, while calls where the second argument is
instantiated, like <TT>p(X,b)</TT>, will be indexed on the second.<BR>
<BR>
However, the decision is still based on only one argument at a time:
a call like <TT>p(d,b)</TT> will be indexed on the first argument only
(not because it is the first, but because it is more discriminating
than the second). If it is crucial that such a procedure is executed
as fast as possible with such a calling pattern, it can help to define
an auxiliary procedure which will be indexed on the other argument:
<BLOCKQUOTE CLASS="quote">
<PRE CLASS="verbatim">
p(X, a) :- pa(X).
p(X, b) :- pb(X).
p(b, c) :- e.

pa(a) :- a.
pa(b) :- b.

pb(a) :- c.
pb(d) :- d.
</PRE></BLOCKQUOTE>
For indexing, the compiler also tries to use all type-testing information
that appears at the beginning of a clause body (or at the beginning of a disjunction):
<UL CLASS="itemize"><LI CLASS="li-itemize">Type testing predicates <A HREF="../bips/kernel/typetest/free-1.html"><B>free/1</B></A><A NAME="@default270"></A>, <A HREF="../bips/kernel/typetest/var-1.html"><B>var/1</B></A><A NAME="@default271"></A>, <A HREF="../bips/kernel/typetest/meta-1.html"><B>meta/1</B></A><A NAME="@default272"></A>,
<A HREF="../bips/kernel/typetest/atom-1.html"><B>atom/1</B></A><A NAME="@default273"></A>, <A HREF="../bips/kernel/typetest/integer-1.html"><B>integer/1</B></A><A NAME="@default274"></A>,
<A HREF="../bips/kernel/typetest/rational-1.html"><B>rational/1</B></A><A NAME="@default275"></A>,
<A HREF="../bips/kernel/typetest/float-1.html"><B>float/1</B></A><A NAME="@default276"></A>,
<A HREF="../bips/kernel/typetest/breal-1.html"><B>breal/1</B></A><A NAME="@default277"></A>,
<A HREF="../bips/kernel/typetest/real-1.html"><B>real/1</B></A><A NAME="@default278"></A>,
<A HREF="../bips/kernel/typetest/number-1.html"><B>number/1</B></A><A NAME="@default279"></A>,
<A HREF="../bips/kernel/typetest/string-1.html"><B>string/1</B></A><A NAME="@default280"></A>, <A HREF="../bips/kernel/typetest/atomic-1.html"><B>atomic/1</B></A><A NAME="@default281"></A>, <A HREF="../bips/kernel/typetest/compound-1.html"><B>compound/1</B></A><A NAME="@default282"></A>, <A HREF="../bips/kernel/typetest/nonvar-1.html"><B>nonvar/1</B></A><A NAME="@default283"></A> and
<A HREF="../bips/kernel/typetest/nonground-1.html"><B>nonground/1</B></A><A NAME="@default284"></A>.<BR>
<BR>
<LI CLASS="li-itemize">Explicit unification and value testing
<A HREF="../bips/kernel/termcomp/E-2.html"><B>=/2</B></A><A NAME="@default285"></A>, <A HREF="../bips/kernel/termcomp/EE-2.html"><B>==/2</B></A><A NAME="@default286"></A>, 
<A HREF="../bips/kernel/termcomp/REE-2.html"><B>\==/2</B></A><A NAME="@default287"></A> and <A HREF="../bips/kernel/termcomp/RE-2.html"><B>\=/2</B></A><A NAME="@default288"></A>.<BR>
<BR>
<LI CLASS="li-itemize">Combinations of tests with <A HREF="../bips/kernel/control/C-2.html"><B>,/2</B></A><A NAME="@default289"></A>, <A HREF="../bips/kernel/control/O-2.html"><B>;/2</B></A><A NAME="@default290"></A>,
<A HREF="../bips/kernel/control/not-1.html"><B>not/1</B></A><A NAME="@default291"></A>, <A HREF="../bips/kernel/control/-G-2.html"><B>&minus;&gt;/2</B></A><A NAME="@default292"></A>.<BR>
<BR>
<LI CLASS="li-itemize">A cut after the type tests.
</UL>
If the compiler can decide about the clause selection at compile time,
the type tests are never executed and thus incur no overhead.
When the type tests alone do not make the clauses disjoint, either add a cut
after the test or add more tests to the other clauses.
For example, the following procedure will be recognised as deterministic
and all its tests will be optimised away:
<PRE CLASS="verbatim">
    % a procedure without cuts
    p(X) :- var(X), ...
    p(X) :- (atom(X); integer(X)), X \= [], ...
    p(X) :- nonvar(X), X = [_|_], ...
    p(X) :- nonvar(X), X = [], ...
</PRE>
Another example:
<PRE CLASS="verbatim">
    % A procedure with cuts
    p(X{_}) ?- !, ...
    p(X) :- var(X), !, ...
    p(X) :- integer(X), ...
    p(X) :- real(X), ...
    p([H|T]) :- ...
    p([]) :- ...
</PRE>
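To make the schematic clauses above concrete, here is a small sketch of a
complete procedure written in the same style. The predicate <TT>kind/2</TT> is
hypothetical and serves only as an illustration. The output argument is unified
in the body, after the tests, so that the clause selection depends on the first
argument alone:
<PRE CLASS="verbatim">
    % kind(?Term, -Kind): classify Term by its type
    kind(X, Kind) :- var(X), !, Kind = variable.
    kind(X, Kind) :- integer(X), Kind = integer.
    kind(X, Kind) :- real(X), Kind = real.
    kind([_|_], Kind) :- Kind = list.
    kind([], Kind) :- Kind = nil.
</PRE>
With these clauses, a call such as <TT>kind(42, K)</TT> binds <TT>K</TT> to
<TT>integer</TT>, and <TT>kind([a], K)</TT> binds it to <TT>list</TT>.
The cut in the first clause is needed because an unbound argument would
otherwise also match the heads of the last two clauses.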
Here are some more hints for efficient coding with ECL<SUP><I>i</I></SUP>PS<SUP><I>e</I></SUP>:
<UL CLASS="itemize"><LI CLASS="li-itemize">Arguments which are repeated in the clause head and in the first
regular goal in the body do not require any data movement and thus
cost nothing. For example,
<BLOCKQUOTE CLASS="quote">
<PRE CLASS="verbatim">
p(X, Y, Z, T, U) :- q(X, Y, Z, T, U).
</PRE></BLOCKQUOTE>
is just as cheap as
<BLOCKQUOTE CLASS="quote">
<PRE CLASS="verbatim">
p :- q.
</PRE></BLOCKQUOTE>
On the other hand, switching arguments requires data moves and so
<BLOCKQUOTE CLASS="quote">
<PRE CLASS="verbatim">
p(A, B, C) :- q(B, C, A).
</PRE></BLOCKQUOTE>
is somewhat more expensive.<BR>
<BR>
<LI CLASS="li-itemize">When accessing an argument of a
structure whose functor is known, unification and 
<A HREF="../bips/kernel/termmanip/arg-3.html"><B>arg/3</B></A><A NAME="@default293"></A> are both similarly
efficient, so whether to write <TT>Struct=emp(_,X,_)</TT> or 
<TT>arg(2,Struct,X)</TT> is just a matter of taste and style.<BR>
<BR>
In either case, the structure notation (see section&nbsp;<A HREF="umsroot022.html#chapstruct">5.1</A>)
<A NAME="@default294"></A>
should be used, as it improves readability without adding any overhead.
For example, write <TT>Struct=emp{salary:X}</TT> or
<TT>arg(salary of emp,Struct,X)</TT>.<BR>
<BR>
<LI CLASS="li-itemize">Tests are generally rather slow unless they can be compiled away
(see <I>indexing</I>).
<BR>
<BR>
<LI CLASS="li-itemize">Waking is more expensive (due to the priority mechanism) than metacalling,
which is in turn more expensive than compiled calls.
Metacalls however do not carry as heavy a penalty as in 
some other Prolog systems.<BR>
<BR>
<LI CLASS="li-itemize">Sorting using <A HREF="../bips/kernel/termcomp/sort-2.html"><B>sort/2</B></A><A NAME="@default295"></A> is very efficient and it does not use
much space.
Using <A HREF="../bips/kernel/allsols/setof-3.html"><B>setof/3</B></A><A NAME="@default296"></A>, <A HREF="../bips/kernel/allsols/findall-3.html"><B>findall/3</B></A><A NAME="@default297"></A> etc. is also efficient enough
to be used every time a list of all solutions is needed.<BR>
<BR>
<LI CLASS="li-itemize"><A HREF="../bips/kernel/termcomp/E-2.html"><B>=/2</B></A><A NAME="@default298"></A> and
<A HREF="../bips/kernel/termcomp/EE-2.html"><B>==/2</B></A><A NAME="@default299"></A>
are faster than <A HREF="../bips/kernel/arithmetic/ENE-2.html"><B>=:=/2</B></A><A NAME="@default300"></A>.<BR>
<BR>
<LI CLASS="li-itemize"><A HREF="../bips/kernel/control/N-2.html"><B>:/2</B></A><A NAME="@default301"></A> is optimised away by the compiler
if both arguments are known.<BR>
<BR>
<LI CLASS="li-itemize">Starting from ECL<SUP><I>i</I></SUP>PS<SUP><I>e</I></SUP> 6.0, there is no performance difference between
using multiple clauses, disjunctions or if-then-else cascades.
In fact, the compiler normalises multiple clause predicates into
a single-clause representation with inline disjunctions.
Disjunctions are indexed.<BR>
<BR>
<LI CLASS="li-itemize">Conditionals with <B>&minus;&gt; ;</B> are compiled more efficiently
if the condition is an indexable built-in test.</UL>
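The hint about structure notation in the list above can be sketched as follows.
The field names of <TT>emp</TT> and the two accessor predicates are assumptions
made for this illustration only:
<BLOCKQUOTE CLASS="quote">
<PRE CLASS="verbatim">
% Declare the structure so that field names can be used.
:- local struct(emp(name, salary, dept)).

% Two equivalent ways to read the salary field.
salary_unify(Emp, S) :- Emp = emp{salary:S}.
salary_arg(Emp, S)   :- arg(salary of emp, Emp, S).

?- E = emp{name:ann, salary:100, dept:sales},
   salary_unify(E, S1),
   salary_arg(E, S2).
</PRE></BLOCKQUOTE>
Both accessors bind their second argument to <TT>100</TT> here. Neither
depends on the position of the field, so adding a field to the declaration
later does not require changing them.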
<HR>
<A HREF="umsroot034.html"><IMG SRC ="previous_motif.gif" ALT="Previous"></A>
<A HREF="umsroot028.html"><IMG SRC ="contents_motif.gif" ALT="Up"></A>
<A HREF="umsroot036.html"><IMG SRC ="next_motif.gif" ALT="Next"></A>
</BODY>
</HTML>
